AITopics | universal consistency

Collaborating Authors

universal consistency

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The Theory and Practice of Highly Scalable Gaussian Process Regression with Nearest Neighbours

Allison, Robert, Maciazek, Tomasz, Stephenson, Anthony

arXiv.org Machine LearningApr-9-2026

Gaussian process ($GP$) regression is a widely used non-parametric modeling tool, but its cubic complexity in the training size limits its use on massive data sets. A practical remedy is to predict using only the nearest neighbours of each test point, as in Nearest Neighbour Gaussian Process ($NNGP$) regression for geospatial problems and the related scalable $GPnn$ method for more general machine-learning applications. Despite their strong empirical performance, the large-$n$ theory of $NNGP/GPnn$ remains incomplete. We develop a theoretical framework for $NNGP$ and $GPnn$ regression. Under mild regularity assumptions, we derive almost sure pointwise limits for three key predictive criteria: mean squared error ($MSE$), calibration coefficient ($CAL$), and negative log-likelihood ($NLL$). We then study the $L_2$-risk, prove universal consistency, and show that the risk attains Stone's minimax rate $n^{-2α/(2p+d)}$, where $α$ and $p$ capture regularity of the regression problem. We also prove uniform convergence of $MSE$ over compact hyper-parameter sets and show that its derivatives with respect to lengthscale, kernel scale, and noise variance vanish asymptotically, with explicit rates. This explains the observed robustness of $GPnn$ to hyper-parameter tuning. These results provide a rigorous statistical foundation for $NNGP/GPnn$ as a highly scalable and principled alternative to full $GP$ models.

artificial intelligence, assumption, machine learning, (15 more...)

arXiv.org Machine Learning

2604.07267

Country:

Europe > United Kingdom (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

f340f1b1f65b6df5b5e3f94d95b11daf-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 02:27:57 GMT

assumption, hscic, kernel, (14 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

f340f1b1f65b6df5b5e3f94d95b11daf-Paper.pdf

Neural Information Processing SystemsAug-17-2025, 06:30:07 GMT

artificial intelligence, assumption, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

On the Universal Statistical Consistency of Expansive Hyperbolic Deep Convolutional Neural Networks

Ghosh, Sagar, Bose, Kushal, Das, Swagatam

arXiv.org Machine LearningNov-15-2024

The emergence of Deep Convolutional Neural Networks (DCNNs) has been a pervasive tool for accomplishing widespread applications in computer vision. Despite its potential capability to capture intricate patterns inside the data, the underlying embedding space remains Euclidean and primarily pursues contractive convolution. Several instances can serve as a precedent for the exacerbating performance of DCNNs. The recent advancement of neural networks in the hyperbolic spaces gained traction, incentivizing the development of convolutional deep neural networks in the hyperbolic space. In this work, we propose Hyperbolic DCNN based on the Poincar\'{e} Disc. The work predominantly revolves around analyzing the nature of expansive convolution in the context of the non-Euclidean domain. We further offer extensive theoretical insights pertaining to the universal consistency of the expansive convolution in the hyperbolic space. Several simulations were performed not only on the synthetic datasets but also on some real-world datasets. The experimental results reveal that the hyperbolic convolutional architecture outperforms the Euclidean ones by a commendable margin.

architecture, convolution, hyperbolic space, (14 more...)

arXiv.org Machine Learning

2411.10128

Country:

Oceania > Australia > Tasmania (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (0.82)

Industry: Energy (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes

Ko, Hyunouk, Huo, Xiaoming

arXiv.org Machine LearningJan-8-2024

In this paper, we first extend the result of FL93 and prove universal consistency for a classification rule based on wide and deep ReLU neural networks trained on the logistic loss. Unlike the approach in FL93 that decomposes the estimation and empirical error, we directly analyze the classification risk based on the observation that a realization of a neural network that is wide enough is capable of interpolating an arbitrary number of points. Secondly, we give sufficient conditions for a class of probability measures under which classifiers based on neural networks achieve minimax optimal rates of convergence. Our result is motivated from the practitioner's observation that neural networks are often trained to achieve 0 training error, which is the case for our proposed neural network classifiers. Our proofs hinge on recent developments in empirical risk minimization and on approximation rates of deep ReLU neural networks for various function classes of interest. Applications to classical function spaces of smoothness illustrate the usefulness of our result.

artificial intelligence, kolmogorov-donoho optimal function class, machine learning, (3 more...)

arXiv.org Machine Learning

2401.04286

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Universal consistency of the $k$-NN rule in metric spaces and Nagata dimension. II

Kumari, Sushma, Pestov, Vladimir G.

arXiv.org Artificial IntelligenceDec-30-2023

We continue to investigate the $k$ nearest neighbour learning rule in separable metric spaces. Thanks to the results of C\'erou and Guyader (2006) and Preiss (1983), this rule is known to be universally consistent in every metric space $X$ that is sigma-finite dimensional in the sense of Nagata. Here we show that the rule is strongly universally consistent in such spaces in the absence of ties. Under the tie-breaking strategy applied by Devroye, Gy\"{o}rfi, Krzy\.{z}ak, and Lugosi (1994) in the Euclidean setting, we manage to show the strong universal consistency in non-Archimedian metric spaces (that is, those of Nagata dimension zero). Combining the theorem of C\'erou and Guyader with results of Assouad and Quentin de Gromard (2006), one deduces that the $k$-NN rule is universally consistent in metric spaces having finite dimension in the sense of de Groot. In particular, the $k$-NN rule is universally consistent in the Heisenberg group which is not sigma-finite dimensional in the sense of Nagata as follows from an example independently constructed by Kor\'anyi and Reimann (1995) and Sawyer and Wheeden (1992).

classifier, dimension, metric space, (14 more...)

arXiv.org Artificial Intelligence

2305.17282

Country:

North America > United States > New York (0.04)
South America > Brazil > Santa Catarina > Florianópolis (0.04)
South America > Brazil > Paraíba > João Pessoa (0.04)
(5 more...)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)

Add feedback

Universal Regression with Adversarial Responses

Blanchard, Moïse, Jaillet, Patrick

arXiv.org Artificial IntelligenceJun-9-2023

We provide algorithms for regression with adversarial responses under large classes of non-i.i.d. instance sequences, on general separable metric spaces, with provably minimal assumptions. We also give characterizations of learnability in this regression context. We consider universal consistency which asks for strong consistency of a learner without restrictions on the value responses. Our analysis shows that such an objective is achievable for a significantly larger class of instance sequences than stationary processes, and unveils a fundamental dichotomy between value spaces: whether finite-horizon mean estimation is achievable or not. We further provide optimistically universal learning rules, i.e., such that if they fail to achieve universal consistency, any other algorithms will fail as well. For unbounded losses, we propose a mild integrability condition under which there exist algorithms for adversarial regression under large classes of non-i.i.d. instance sequences. In addition, our analysis also provides a learning rule for mean estimation in general metric spaces that is consistent under adversarial responses without any moment conditions on the sequence, a result of independent interest.

artificial intelligence, machine learning, value space, (16 more...)

arXiv.org Artificial Intelligence

2203.05067

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Universal Consistency of Multi-Class Support Vector Classification

Neural Information Processing SystemsApr-6-2023, 13:18:14 GMT

Steinwart was the first to prove universal consistency of support vector machine classification. His proof analyzed the'standard' support vector machine classifier, which is restricted to binary classification problems. In contrast, recent analysis has resulted in the common belief that several extensions of SVM classification to more than two classes are inconsistent. Our proof extends Steinwart's techniques to the multi-class case.

multi-class support vector classification, steinwart, universal consistency

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Contextual Bandits and Optimistically Universal Learning

Blanchard, Moise, Hanneke, Steve, Jaillet, Patrick

arXiv.org Artificial IntelligenceDec-31-2022

We consider the contextual bandit problem on general action and context spaces, where the learner's rewards depend on their selected actions and an observable context. This generalizes the standard multi-armed bandit to the case where side information is available, e.g., patients' records or customers' history, which allows for personalized treatment. We focus on consistency -- vanishing regret compared to the optimal policy -- and show that for large classes of non-i.i.d. contexts, consistency can be achieved regardless of the time-invariant reward mechanism, a property known as universal consistency. Precisely, we first give necessary and sufficient conditions on the context-generating process for universal consistency to be possible. Second, we show that there always exists an algorithm that guarantees universal consistency whenever this is achievable, called an optimistically universal learning rule. Interestingly, for finite action spaces, learnable processes for universal learning are exactly the same as in the full-feedback setting of supervised learning, previously studied in the literature. In other words, learning can be performed with partial feedback without any generalization cost. The algorithms balance a trade-off between generalization (similar to structural risk minimization) and personalization (tailoring actions to specific contexts). Lastly, we consider the case of added continuity assumptions on rewards and show that these lead to universal consistency for significantly larger classes of data-generating processes.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2301.00241

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.63)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.86)

Add feedback

Universally Consistent Online Learning with Arbitrarily Dependent Responses

Hanneke, Steve

arXiv.org Machine LearningMar-11-2022

This work provides an online learning rule that is universally consistent under processes on(X, Y) pairs, under conditions only on the X process. As a special case, the conditions admit all processes on (X,Y) such that the process on X is stationary. This generalizes past results which required stationarity for the joint process on(X, Y), and additionally required this process to be ergodic. In particular, this means that ergodicity is superfluous for the purpose of universally consistent online learning.

hanneke, online, sequence, (13 more...)

arXiv.org Machine Learning

2203.06046

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.95)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback